sensor modality
- Europe > Switzerland (0.04)
- Europe > Netherlands > South Holland > Delft (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Transportation > Ground > Road (1.00)
- Information Technology (1.00)
- Automobiles & Trucks (1.00)
- (2 more...)
Towards Sensor Data Abstraction of Autonomous Vehicle Perception Systems
Reichert, Hannes, Lang, Lukas, Rösch, Kevin, Bogdoll, Daniel, Doll, Konrad, Sick, Bernhard, Reuss, Hans-Christian, Stiller, Christoph, Zöllner, J. Marius
Abstract: Full-stack autonomous driving perception modules usually consist of data-driven models based on multiple sensor modalities. However, these models might be biased to the sensor setup used for data acquisition. This bias can seriously impair the perception models' transferability to new sensor setups, which continuously occur due to the market's competitive nature. We envision sensor data abstraction as an interface between sensor data and machine learning applications for highly automated driving (HAD). For this purpose, we review the primary sensor modalities, camera, lidar, and radar, published in autonomous-driving related datasets, examine single sensor abstraction and abstraction of sensor setups, and identify critical paths towards an abstraction of sensor data from multiple perception configurations.
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.05)
- North America > United States > Oregon > Multnomah County > Portland (0.04)
- Automobiles & Trucks (0.88)
- Transportation > Ground > Road (0.55)
- Information Technology > Robotics & Automation (0.55)
Multi-modal Co-learning for Earth Observation: Enhancing single-modality models via modality collaboration
Mena, Francisco, Ienco, Dino, Dantas, Cassio F., Interdonato, Roberto, Dengel, Andreas
Multi-modal co-learning is emerging as an effective paradigm in machine learning, enabling models to collaboratively learn from different modalities to enhance single-modality predictions. Earth Observation (EO) represents a quintessential domain for multi-modal data analysis, wherein diverse remote sensors collect data to sense our planet. This unprecedented volume of data introduces novel challenges. Specifically, the access to the same sensor modalities at both training and inference stages becomes increasingly complex based on real-world constraints affecting remote sensing platforms. In this context, multi-modal co-learning presents a promising strategy to leverage the vast amount of sensor-derived data available at the training stage to improve single-modality models for inference-time deployment. Most current research efforts focus on designing customized solutions for either particular downstream tasks or specific modalities available at the inference stage. To address this, we propose a novel multi-modal co-learning framework capable of generalizing across various tasks without targeting a specific modality for inference. Our approach combines contrastive and modality discriminative learning together to guide single-modality models to structure the internal model manifold into modality-shared and modality-specific information. We evaluate our framework on four EO benchmarks spanning classification and regression tasks across different sensor modalities, where only one of the modalities available during training is accessible at inference time. Our results demonstrate consistent predictive improvements over state-of-the-art approaches from the recent machine learning and computer vision literature, as well as EO-specific methods. The obtained findings validate our framework in the single-modality inference scenarios across a diverse range of EO applications.
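As a rough illustration of the contrastive part of such a co-learning setup, the sketch below computes a generic InfoNCE-style loss between paired embeddings from two modalities. It is not the authors' loss: the function name `info_nce`, the temperature value, and the use of plain NumPy arrays as embeddings are all assumptions made for illustration, and the modality-discriminative term mentioned in the abstract is not covered.

```python
import numpy as np


def info_nce(za: np.ndarray, zb: np.ndarray, temperature: float = 0.1) -> float:
    """Generic InfoNCE loss between paired embeddings of two modalities.

    za, zb: arrays of shape (batch, dim); row i of za is paired with row i of zb.
    The temperature is an illustrative default, not a value from the paper.
    """
    # L2-normalise so the dot product is a cosine similarity.
    za = za / np.linalg.norm(za, axis=1, keepdims=True)
    zb = zb / np.linalg.norm(zb, axis=1, keepdims=True)

    # Pairwise similarities; the diagonal holds the positive pairs.
    logits = za @ zb.T / temperature

    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Negative log-likelihood of picking the matching sample in each row.
    return float(-np.mean(np.diag(log_prob)))
```

The loss is low when each sample's embedding in one modality is closest to its own counterpart in the other modality, which is one common way to encourage modality-shared structure.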
- Asia (0.24)
- Europe > France > Occitanie > Hérault > Montpellier (0.05)
- Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.04)
- (4 more...)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (0.93)
Sensor Model Identification via Simultaneous Model Selection and State Variable Determination
Brommer, Christian, Fornasier, Alessandro, Steinbrener, Jan, Weiss, Stephan
We present a method for the unattended gray-box identification of sensor models commonly used by localization algorithms in the field of robotics. The objective is to determine the most likely sensor model for a time series of unknown measurement data, given an extendable catalog of predefined sensor models. Sensor model definitions may require states for rigid-body calibrations and dedicated reference frames to replicate a measurement based on the robot's localization state. A health metric is introduced, which verifies the outcome of the selection process in order to detect false positives and facilitate reliable decision-making. In a second stage, an initial guess for identified calibration states is generated, and the necessity of sensor world reference frames is evaluated. The identified sensor model with its parameter information is then used to parameterize and initialize a state estimation application, thus ensuring a more accurate and robust integration of new sensor elements. This method is helpful for inexperienced users who want to identify the source and type of a measurement, sensor calibrations, or sensor reference frames. It will also be important in the field of modular multi-agent scenarios and modularized robotic platforms that are augmented by sensor modalities during runtime. Overall, this work aims to provide a simplified integration of sensor modalities to downstream applications and circumvent common pitfalls in the usage and development of localization approaches.
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > Austria (0.04)
- North America > United States > New York > Suffolk County > Stony Brook (0.04)
- Europe > Greece > Attica > Athens (0.04)
In-Hand Object Pose Estimation via Visual-Tactile Fusion
Nonnengießer, Felix, Kshirsagar, Alap, Belousov, Boris, Peters, Jan
Accurate in-hand pose estimation is crucial for robotic object manipulation, but visual occlusion remains a major challenge for vision-based approaches. This paper presents an approach to robotic in-hand object pose estimation, combining visual and tactile information to accurately determine the position and orientation of objects grasped by a robotic hand. We address the challenge of visual occlusion by fusing visual information from a wrist-mounted RGB-D camera with tactile information from vision-based tactile sensors mounted on the fingertips of a robotic gripper. Our approach employs a weighting and sensor fusion module to combine point clouds from heterogeneous sensor types and control each modality's contribution to the pose estimation process. We use an augmented Iterative Closest Point (ICP) algorithm adapted for weighted point clouds to estimate the 6D object pose. Our experiments show that incorporating tactile information significantly improves pose estimation accuracy, particularly when occlusion is high. Our method achieves an average pose estimation error of 7.5 mm and 16.7 degrees, outperforming vision-only baselines by up to 20%. We also demonstrate the ability of our method to perform precise object manipulation in a real-world insertion task. In-hand pose estimation describes the process of determining the position and orientation of an object held within a robotic hand.
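To make the idea of ICP on weighted point clouds concrete, here is a minimal sketch, assuming per-point weights are already given (for example, a higher weight for tactile points than for camera points). It is not the paper's implementation: the function names, the brute-force nearest-neighbour search, and the fixed iteration count are assumptions chosen to keep the example short.

```python
import numpy as np


def weighted_rigid_fit(src, dst, w):
    """Weighted least-squares rigid transform (R, t) mapping src onto dst.

    src, dst: (N, 3) arrays with row-wise correspondences; w: (N,) positive weights.
    """
    w = w / w.sum()
    mu_s = (w[:, None] * src).sum(axis=0)
    mu_d = (w[:, None] * dst).sum(axis=0)
    # Weighted cross-covariance of the centred point sets.
    H = (w[:, None] * (src - mu_s)).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so the result is a proper rotation.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t


def weighted_icp(src, dst, w, iters=20):
    """Hypothetical weighted ICP loop: match, fit with weights, apply, repeat."""
    R_total, t_total = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iters):
        # Brute-force nearest neighbour in dst for every current source point.
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        matched = dst[d2.argmin(axis=1)]
        R, t = weighted_rigid_fit(cur, matched, w)
        cur = cur @ R.T + t
        # Compose the incremental transform with the running estimate.
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

In this sketch the weights only enter the alignment step, so raising the weight of one sensor's points makes the fit follow those points more closely. How the weights are chosen is left open here.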
- Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
- Europe > Germany > Hesse > Darmstadt Region > Frankfurt (0.04)
Robust sensor fusion against on-vehicle sensor staleness
Fan, Meng, Zuo, Yifan, Blaes, Patrick, Montgomery, Harley, Das, Subhasis
Sensor fusion is crucial for a performant and robust perception system in autonomous vehicles, but sensor staleness, where data from different sensors arrives with varying delays, poses significant challenges. Temporal misalignment between sensor modalities leads to inconsistent object state estimates, severely degrading the quality of trajectory predictions that are critical for safety. We present a novel and model-agnostic approach to address this problem via (1) a per-point timestamp offset feature (for LiDAR and radar both relative to camera) that enables fine-grained temporal awareness in sensor fusion, and (2) a data augmentation strategy that simulates realistic sensor staleness patterns observed in deployed vehicles. Our method is integrated into a perspective-view detection model that consumes sensor data from multiple LiDARs, radars and cameras. We demonstrate that while a conventional model shows significant regressions when one sensor modality is stale, our approach reaches consistently good performance across both synchronized and stale conditions.
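The two ingredients named in the abstract can be sketched roughly as follows. This is an illustrative sketch only, assuming each LiDAR or radar frame is a pair of NumPy arrays (points and per-point timestamps). The function names, the `p_stale` and `max_lag` parameters, and the uniform lag sampling are hypothetical and do not reflect the staleness patterns the authors measured on deployed vehicles.

```python
import numpy as np


def add_time_offset_feature(points, point_times, camera_time):
    """Append (point timestamp - camera timestamp) as an extra feature column."""
    offset = (point_times - camera_time).astype(points.dtype)
    return np.concatenate([points, offset[:, None]], axis=1)


def sample_with_staleness(frames, camera_time, rng, p_stale=0.3, max_lag=2):
    """Pick a LiDAR/radar frame for training, sometimes an outdated one.

    frames: list of (points, point_times) tuples ordered oldest to newest.
    p_stale and max_lag are illustrative knobs, not values from the paper.
    """
    lag = 0
    if rng.random() < p_stale:
        # Replace the latest frame by one that is 1..max_lag frames old.
        lag = int(rng.integers(1, max_lag + 1))
    lag = min(lag, len(frames) - 1)
    points, point_times = frames[len(frames) - 1 - lag]
    # The offset column tells the model how old each point is.
    return add_time_offset_feature(points, point_times, camera_time)
```

Because the offset is computed against the camera timestamp, a stale frame produces larger negative offsets, which gives a downstream model an explicit signal about the misalignment.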
Universal Framework to Evaluate Automotive Perception Sensor Impact on Perception Functions
Current research on automotive perception systems predominantly focusses on either improving the sensors for data quality or enhancing the performance of perception functions in isolation. Although automotive perception sensors form a fundamental part of the perception system, value addition in sensor data quality in isolation is questionable. However, the end goal for most perception systems is the accuracy of high-level functions such as trajectory prediction of surrounding vehicles. High-level perception functions are increasingly based on deep learning (DL) models due to their improved performance and generalisability compared to traditional algorithms. Innately, DL models develop a performance bias on the comprehensiveness of the training data. Despite the vital need to evaluate the performance of DL-based perception functions under real-world conditions using onboard sensor inputs, there is a lack of frameworks to facilitate systematic evaluations. This paper presents a versatile and cost-effective framework to evaluate the impact of perception sensor modalities and parameter settings on DL-based perception functions. Using a simulation environment, the framework facilitates sensor modality testing and parameter tuning under different environmental conditions. Its effectiveness is demonstrated through a case study involving a state-of-the-art surround trajectory prediction model, highlighting performance differences across sensor modalities and recommending optimal parameter settings. The proposed framework offers valuable insights for designing the perception sensor suite, contributing to the development of robust perception systems for autonomous vehicles.
- North America > United States (0.14)
- Europe > United Kingdom (0.14)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (1.00)
AutoMR: A Universal Time Series Motion Recognition Pipeline
Zhang, Likun, Yang, Sicheng, Wang, Zhuo, Liang, Haining, Shen, Junxiao
In this paper, we present an end-to-end automated motion recognition (AutoMR) pipeline designed for multimodal datasets. The proposed framework seamlessly integrates data preprocessing, model training, hyperparameter tuning, and evaluation, enabling robust performance across diverse scenarios. Our approach addresses two primary challenges: 1) variability in sensor data formats and parameters across datasets, which traditionally requires task-specific machine learning implementations, and 2) the complexity and time consumption of hyperparameter tuning for optimal model performance. Our library features an all-in-one solution incorporating QuartzNet as the core model, automated hyperparameter tuning, and comprehensive metrics tracking. Extensive experiments demonstrate its effectiveness on 10 diverse datasets, achieving state-of-the-art performance. This work lays a solid foundation for deploying motion-capture solutions across varied real-world applications.
- North America > United States > California > Alameda County > Berkeley (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > Northern Ireland > County Down > Belfast (0.04)
- (2 more...)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Beyond Sight: Finetuning Generalist Robot Policies with Heterogeneous Sensors via Language Grounding
Jones, Joshua, Mees, Oier, Sferrazza, Carmelo, Stachowicz, Kyle, Abbeel, Pieter, Levine, Sergey
Interacting with the world is a multi-sensory experience: achieving effective general-purpose interaction requires making use of all available modalities, including vision, touch, and audio, to fill in gaps from partial observation. For example, when vision is occluded reaching into a bag, a robot should rely on its senses of touch and sound. However, state-of-the-art generalist robot policies are typically trained on large datasets to predict robot actions solely from visual and proprioceptive observations. In this work, we propose FuSe, a novel approach that enables finetuning visuomotor generalist policies on heterogeneous sensor modalities for which large datasets are not readily available by leveraging natural language as a common cross-modal grounding. We combine a multimodal contrastive loss with a sensory-grounded language generation loss to encode high-level semantics. In the context of robot manipulation, we show that FuSe enables performing challenging tasks that require reasoning jointly over modalities such as vision, touch, and sound in a zero-shot setting, such as multimodal prompting, compositional cross-modal prompting, and descriptions of objects it interacts with. We show that the same recipe is applicable to widely different generalist policies, including both diffusion-based generalist policies and large vision-language-action (VLA) models. Extensive experiments in the real world show that FuSe is able to increase success rates by over 20% compared to all considered baselines.
- Europe > United Kingdom > England > Greater London > London (0.04)
- Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
- North America > United States (0.04)
- (5 more...)